The statistical theory of linear selection indices from phenotypic to genomic selection

Abstract A linear selection index (LSI) can be a linear combination of phenotypic values, marker scores, and genomic estimated breeding values (GEBVs); phenotypic values and marker scores; or phenotypic values and GEBVs jointly. The main objective of the LSI is to predict the net genetic merit (H), which is a linear combination of unobservable individual traits’ breeding values, weighted by the trait economic values; thus, the target of LSI is not a parameter but rather the unobserved random H values. The LSI can be single‐stage or multi‐stage, where the latter are methods for selecting one or more individual traits available at different times or stages of development in both plants and animals. Likewise, LSIs can be either constrained or unconstrained. A constrained LSI imposes predetermined genetic gain on expected genetic gain per trait and includes the unconstrained LSI as particular cases. The main LSI parameters are the selection response, the expected genetic gain per trait, and its correlation with H. When the population mean is zero, the selection response and expected genetic gain per trait are, respectively, the conditional mean of H and the genotypic values, given the LSI values. The application of LSI theory is rapidly diversifying; however, because LSIs are based on the best linear predictor and on the canonical correlation theory, the LSI theory can be explained in a simple form. We provided a review of the statistical theory of the LSI from phenotypic to genomic selection showing their relationships, advantages, and limitations, which should allow breeders to use the LSI theory confidently in breeding programs.


INTRODUCTION
published a paper called "A discriminant function for plant selection," in which he described a statistical method to select parents for the next selection cycle based on a linear combination of several joint quantitative traits. That method makes it possible to improve several traits that differ in additive variability, heritability, economic importance, and in the correlation among their phenotypes and genotypes in the plant and animal breeding context (Hazel et al., 1994). Hazel and Lush (1942) called that selection method "total score," whereas Hazel (1943) called it "selection index," and Cerón-Rojas and Crossa (2018) called it "linear phenotypic selection index" (LPSI). Smith (1936) made a clear distinction between the LPSI (discriminant function) and the net genetic merit (H). The LPSI is a linear combination of observable phenotypic records, and H is an unobservable linear combination of trait breeding values (g) weighted by the trait economic values (W).
The main assumptions made by Smith (1936) to develop the LPSI theory are as follows: the vector of genotypic values g (the average of the phenotypic values across a [large] population of environments, or the genetic constitution of an organism or cell) is composed entirely of additive effects of genes and thus is the plant or animal breeding value, and H is the total individual genotypic value (Hazel & Lush, 1942;Kempthorne & Nordskog, 1959). Finally, the vectors of adjusted means of phenotypic values (y) and H have a joint multivariate normal distribution. By those assumptions, the joint distribution of H and LPSI is bivariate normal, whereas the conditional distribution of H given LPSI (and the conditional distribution of LPSI given H) is univariate normal. Under those assumptions, the regression of H on y is linear (Kempthorne & Nordskog, 1959), and it is possible to write the H and y values as a multiple linear regression model, where H and y are the dependent and independent variables, respectively, and both are random. In this model, the LPSI is the conditional expectation of H given y. Then, to predict H, one must find the values of the LPSI regression coefficients so that the LPSI values may best discriminate those individuals that have the highest H values (Smith, 1936). Smith (1936) wrote his paper before the Eisenhart (1947) article, which highlighted that there are two fundamentally different explanatory variables, called "fixed" and "random" effects. One important result of this distintion is that whereas fixed effects influence only the mean of y, random effects influence only the variance of y and thus the LPSI and its parameters. Assuming that the population mean of H and y is zero, the selection response is the conditional expectation of H given the LPSI values, whereas the expected genetic gain per trait (or multi-trait selection response) is the conditional expectation of g given the linear selection index (LSI) val-

Core Ideas
• Linear selection indices (LSIs) are useful in plant and animal breeding. • The main LSI objective is to predict the net genetic merit of individuals. • We did a complete review of the statistical theory of the LSI. • We described the bases of genomic LSI.
ues (Cochran, 1951;Kempthorne & Nordskog, 1959;Smith, 1936). The estimates of these last parameters and the correlation between H and the LPSI are the main criteria to validate and compare the efficiency of any LSI.
Some authors (Finney, 1962;Hazel & Lush, 1942;Young, 1961) compared the LPSI efficiency with the independent culling method, where breeders sequentially select for a series of independent traits in the same population, and the tandem selection method, which involves sequential selection on a series of traits where repeated cycles of selection for a single trait are practiced until a desirable expression of the trait is reached. Those authors found that the LPSI selection response was higher than the selection response for the other two selection methods. Thus, under certain conditions (e.g., normality), the LPSI is the best option for selection.
The LPSI theory has been extended, and now a LSI can be a linear combination of marker scores (Lande & Thompson, 1990), genomic estimated breeding values (GEBVs) (Cerón-Rojas & Crossa, 2019;Ceron-Rojas et al., 2015), phenotypic values and marker scores (Lande & Thompson, 1990), and phenotypic values and GEBVs jointly (Dekkers, 2007). The last two indices are called "combined LSIs" in this review. An additional distinction is between single-stage and multistage LSIs (Cerón-Rojas & Crossa, 2020a;Cerón-Rojas et al., 2019a, 2019bCochran, 1951;Young, 1964;Xu & Muir, 1992). Multi-stage LSIs are methods for selecting traits at different times or stages of the crop, such as the age at which breeders can measure and select the trait of economic interest and are used to save money and time because these indices do not need to have a large number of individuals in the selection process.
The main objectives of all LSIs are to predict H and select parents for the next generation; thus, the target of study of any LSI is not a parameter but rather the unobserved random H values. Additional objectives of LSIs are to maximize the selection response, the expected genetic gain per trait, and the correlation with H. By imposing restrictions equal to some predetermined values on the expected genetic gain per trait, several authors (Harville, 1975;Kempthorne & Crop Science Nordskog, 1959;Mallard, 1972;Tallis, 1985) developed the constrained LPSI (CLPSI) theory. In addition, by applying the LPSI and CLPSI theories in the context of genomic selection, Ceron-Rojas et al. (2015) and Cerón-Rojas and Crossa (2019) developed the unconstrained and constrained linear genomic selection indices (LGSI and CLGSI, respectively), which are linear combinations of GEBVs. Cerón-Rojas et al. (2016b) have shown that the CLPSI vector of coefficients is a linear combination of the LPSI vector of coefficients or a projection of the unconstrained LPSI vector of coefficients to a different space.
All the LSIs described above assume that the economic weights are known constants; however, for several reasons, this assumption is rarely fulfilled (Vandepitte & Hazel, 1977). Furthermore, several authors (Hazel, 1943;Lin, 1978;Yamada et al., 1975) have indicated that there is no one general method for assigning economic weights to the traits because those economic weights can change from time to time or vary from one location to another in breeding programs. For this reason, modified indices, such as the base index (Williams, 1962), modified base index (Smith, 1983), and non-weighted multiplicative index (Elston, 1963), have been proposed in plant and animal breeding. However, those authors did not show that the latter LSIs maximize the correlation with the net genetic merit. In addition, it can be difficult to obtain the selection response and the expected genetic gain per trait for those indices.
The preceding problems led Cerón-Rojas et al. (2008a, 2016a to develop the eigen selection index method (ESIM), the restricted ESIM (RESIM), and the predetermined proportional gain ESIM (PPG-ESIM) in the canonical correlation context (Hotelling, 1936). The ESIM is an unrestricted index, but the RESIM and PPG-ESIM allow imposing null and predetermined restrictions, respectively, on the expected genetic gains of some traits, while the rest remain without restrictions. These indices do not require economic weights when predicting H and making selections because they use the first eigenvector of the matrix of multi-trait heritabilities (in the ESIM case) or the first eigen vector of a linear transformation of the matrix of multi-trait heritabilities (in the RESIM and PPG-ESIM case) in the prediction. Cerón-Rojas et al. (2008b) adapted the ESIM to the marker context and developed an index (molecular ESIM [MESIM]) similar to the linear molecular selection index (LMSI), whereas Cerón-Rojas and Crossa (2018) adapted ESIM to the genomic selection context and developed a combined genomic eigen selection index method (genomic ESIM [GESIM]), similar to the Dekkers (2007) index.
All indices associated with the LPSI theory (unconstrained and constrained; phenotypic, genomic, multi-stage, etc.) are particular cases of the best linear predictor (BLP) theory, which is based on the minimum mean square prediction error (MSPE), because this approach allows us to obtain the unique best MSPE predictor (Bickel & Doksum, 2015). The MSPE is the expected squared deviation between the true value and the estimate, and it provides an effective measure of proximity to the true parameter of interest (Carlin & Louis, 2000). On the other hand, all indices associated with ESIM are based on the canonical correlation analysis (CCA) theory (Hotelling, 1936). Searle et al. (2006) have indicated that the best predictor (BP), BLP, and the best linear unbiased predictor (BLUP) are associated as follows. When all the parameters of the joint distribution of H and LSI are known, BP is available, whereas for BLP only the first and second moments are assumed to be known, and for BLUP the second but not first moments are assumed to be known. Under multivariate normality, BP and BLP are the same (Bickel & Doksum, 2015;Christensen, 2011;Harville, 2018); however, in practice, those approaches need estimates of their parameters. In this last case, the LSI could not be BLP (Searle et al., 2006). Nevertheless, with good estimates of the phenotypic and genotypic covariance matrices (e.g., by restricted maximum likelihood [REML] estimate), the BLP (or LPSI) is close to the BLUP (Christensen, 2011;Henderson, 1990).
Several authors (Baker, 1986;Bramscap, 1984;Dekkers & Settar, 2004;Hazel et al., 1994;Henderson, 1963;Lin, 1978;Moreau et al., 2007;Nordskog, 1978;Phillipsson et al., 1994) have given partial reviews of the LSIs theory, such as the unconstrained and constrained LPSI and the LMSI, among others. In addition, using minimum mathematical rigor, Cerón-Rojas and Crossa (2018) wrote a user manual for breeders based on the LSI theory, and they gave many illustrative practical examples. In the current study, we reviewed the statistical theory of the LSIs from the unconstrained and constrained LPSI to the unconstrained and constrained LGSI. That is, based on the BLP theory, in the present work, we unified all the statistical theory of the LSIs associated with the LPSI (e.g., CLPSI, LGSI, CLGSI, etc.) from phenotypic to genomic selection. This requires greater mathematical rigor; as such, it is not possible to provide as many illustrative examples as Cerón-Rojas and Crossa (2018) did. In addition, based on the CCA theory, we present a unified theory of the ESIM, RESIM, and PPG-ESIM.
This review has four main parts. The first describes two earlier approaches to the unconstrained LSI theory and a description of the BLUP theory. The second describes the unconstrained LSI theory, including the LPSI, multi-stage LPSI, LGSI, and combined LSI. Similarly, the third part contains the constrained LSI that includes the CLPSI, constrained multi-stage LPSI, and CLGSI. Finally, in the fourth part, we describe ESIM, RESIM, PPG-ESIM, and their relationships with the LPSI and CLPSI. The MESIM and GESIM are only variants of ESIM and the combined LSIs, which can be found in Cerón-Rojas and Crossa (2018). The main objective of this review is to introduce researchers to the statistical theory of Crop Science the LSIs. Our intended audience for this study is mainly plant breeders and graduate-level students who are interested in LSI theory and who come from animal and plant breeding programs.
In subsection "Numerical example: LPSI vs ESIM," we provide an illustrative example using the LPSI and ESIM theory. In subsection "RIndSel," we introduce RIndSel (Alvarado et al., 2018;Pacheco et al., 2017;Perez-Elizalde et al., 2014), an R-software that is useful for estimating the LPSI, LGSI, CLPSI, ESIM, etc., parameters and for selecting individual candidates as parents for the next selection cycle. Furthermore, RIndSel is completely automated, which eliminates the need to learn a specialized syntax for anyone who is a novice user of R-software. This means that the users only need to learn how to introduce their data into the program and how to interpret the RIndSel results. This software, and a complete RIndSel user manual, can be downloaded from https: //data.cimmyt.org/dataset.xhtml?persistentId=hdl:11529/ 10854.

NOTATION
We use uppercase letters to denote random variables and lowercase letters to denote their particular realizations. For example, X is a random variable, whereas x (X = x) is a particular realization of X. Vectors are denoted with bold lowercase letters (e.g., g and y) and matrices with bold uppercase letters (X). Sometimes it is necessary to denote vectors with bold capital letters; for example, E denotes the vector of expected genetic gain per trait. The expectation of the random variable X[E(X)] is scalar, whereas the conditional expectation of X given Y = y [E(X|Y = y)] is a function of y. Thus, before we observe Y, the value of E(X|Y = y) is unknown, so it is a random variable that we denote as E(X|Y). Pearson (1903) and Pearl and Surface (1909) were the first to describe the index theory before the Smith (1936) LPSI theory. As we shall see, Pearson (1903) predicted breeding values based on a linear mixed model, whereas Pearl and Surface (1909) used a nonlinear approximation to the index theory. In addition, whereas the Pearson (1903) index selects genotypes, the Pearl and Surface (1909) index selects phenotypes. This last index selects phenotypes because it does not maximize its correlation with the net genetic merit.

Pearson index
Suppose that X and Z are design matrices and that θ and g are vectors of fixed and random (e.g., breeding values) effects, respectively, associated to the trait of n individual candidates (y) to selection, and assume that y and g have multivariate normal (MVN) distribution, also denoted as "∼", i.e., y ∼ MVN and g ∼ MVN. Consider the following linear mixed model: where y ∼ MVN(Xθ, V) is a n × 1 vector of observations for one trait, Xθ and V = σ g 2 ZAZ′ + I n σ ε 2 are the unconditional expectation and variance of y, respectively, and var(g) = Aσ g 2 is the unconditional variance of g ∼MVN (0, Aσ g 2 ); var(y|g) = I n σ ε 2 is the conditional variance of the vector of residuals ε ∼MVN (0, I n σ ε 2 ), 0 denotes the null expectations of g and ε, A is the numerator relationship matrix (Lynch & Walsh, 1998;Mrode, 2014), and I n is an identity matrix of size n × n.
Based on the multivariate normal distribution theory, Pearson (1903) which now is called "single trait linear selection index" and where V −1 is the inverse matrix of the variance of y, whereaŝ = ( ′ −1 ) −1 ′ −1 (Xu, 2003). Henderson (1990) indicated that the above equation is one of the main Pearson (1903) results. Currently,̂= ( | )is called the BLUP of g and is one of the main tools to make genotypic selection in plant (Piepho et al., 2008) and animal (Mrode, 2014) breeding programs.
The BLUP properties have been developed by Henderson (1949Henderson ( , 1950Henderson ( , 1963Henderson ( , 1984Henderson ( , 1990. This author also showed that the BLUP and BLP (or LPSI) are similar; however, whereas the BLUP predicts vector g, the LPSI predicts a linear combination of g (i.e., the net genetic merit), as shown in Equations 2-4. In addition, Mrode (2014) indicated that BLUP reduces to LPSI when no adjustments for environmental factors are needed. This means that one of the main advantages of BLUP compared with BLP occurs when records have to be pre-adjusted for fixed or for environmental effects. When the environmental effects are random, Xu (2003) writes Equation 1 as where π is a vector of interaction genotypic-environment effects, W = X ⊗ Z, and ⊗ denotes the Kronecker product (Lych & Walsh, 1998). In this case, E(y) = 1 n μ, where 1 n is a vector of n ones, μ is the population mean, V = Xvar(γ)X′ + σ g 2 ZAZ′ + Wvar(π)W′ + I n σ ε 2 , and var(γ) and var(π) are Crop Science the variance of γ and π, respectively. Note that in this last case, nongenetic effects (included in γ), such as location and year effects, are treated as random effects, whereas in Equation 1 those effects are treated as fixed and were included in θ. For additional details, see Xu (2003).

Pearl and Surface index
The Pearl and Surface (1909) approach to the index theory is based on the index numbers theory (Ralph et al., 2015). By this reasoning, those authors called to their index "selection index numbers." The Pearl and Surface (1909) arguments to develop the selection index numbers are the same as those indicated by Smith (1936) in his paper. Pearl and Surface (1909) defined the selection index numbers as where y is a vector of traits of economic interest for the breeder that become more desirable as their values increase, and y* is a vector of traits of economic interest that become more desirable as their values decrease. In the above equation, ν 1 and ν 2 are vector of constants to be given arbitrary values in the proportions that the different variables are to be weighted. The Pearl and Surface (1909) index can be considered a nonlinear selection index and is does not maximize its correlation with the net genetic merit (Equation 2). As we have indicated in the Introduction, in this work we describe the LSIs associated to the LPSI theory and the ESIM index theory.

BLUP in the multi-trait context
Equation 1 is only useful to predict the individual breeding values associated to one trait. However, it is easy to extend Equation 1 to the multi-trait selection context, and it is possible to include in Equation 1 non-additive genetic effects, such as dominant genetic effects (or intra-locus dominance allelic interaction) and epistasis effects (or inter-loci allelic interaction). Multi-trait evaluation is the optimum methodology to evaluate plants and animals because this method increases the accuracy of predictions when it accounts for the phenotypic and genetic correlation between individual traits (Mrode, 2014 is an additive genetic covariance matrix, and ⊗ and A are as defined in Equation 1. In addition, var(y 1 , y 2 | g 1 , , and I n is as defined in Equation 1. It is possible to extend the above equation to any number of traits of economic interest (Mrode, 2014). For one trait, g 1 can be partitioned into additive genetic (a 1 ) effects (or intra-locus additive allelic interaction), dominant genetic (δ 1 ) effects, and epistasis (ι 1 ) effects, such that g 1 = a 1 + δ 1 + ι 1 . Epistasis refers to the interaction among additive and dominance genetic effects, for instance, additive by additive, dominance by dominance, additive by dominance, additive by additive by dominance, etc.; that is, ι 1 = aa 1 + dd 1 + ad 1 + . . .
For t traits, asumming additive and dominance effects, and epistasis effects such as additive by additive, dominance by dominance, and additive by dominance, the variance of g′ = [g′ 1 g′ 2 . . . g′ t ] can be written as where C is as defined earlier; C D is the covariance matrix of the dominant genetic interactions of effects between alleles within loci; C AA , C DD , and C AD are the covariance matrices of the epistasis effects among loci; and A, Θ, ⊗, and "•" are as defined above. Lynch and Walsh (1998) and Mrode (2014) Crop Science have given details of how Equation 1 can be written for additive and non-additive genetic effects jointly.
Although the above results seem very interesting, non-additive effects are not transmitted to offspring in noninbreeding populations (Lynch & Walsh, 1998); the breeding value (or additive genetic effects a 1 ) is the only component that can be selected and is thus the main component of interest in breeding programs (Mrode, 2014). Furthermore, in the index selection context, Andersson et al. (1998) found that the introduction of the dominance variance (e.g., 25% of the total phenotypic variance) has only a small positive effect (<1.5%) on the selection response; thus, to describe the LSI theory, the assumption that only additive effects are transmitted from generation to generation seems acceptable. This last asumption implies that the combined effects of all factors that make y′ = [y′ 1 y′ 2 . . . y′ t ] different from g′ = [g′ 1 g′ 2 . . . g′ t ], such as non-additive effects, are in the residual vector (Hazel & Lush, 1942). Mrode (2014) indicated that non-additive effects tend to be confounded with others, such as common maternal environment; thus, breeders should take care when they estimate non-additive effects. Du and Hoeschele (2000) described methods to estimate additive and non-additive variance components in the Bayesian context. In a similar manner, using linear mixed models theory, in the phenotypic and genomic selection context, Lynch and Walsh (1998) and Su et al. (2012), respectively, described methods to estimate additive and nonadditive variance components.
As we will see later, to predict H, the LPSI uses only matrix C and includes matrix A only for the estimation of C (Cerón-Rojas & Crossa, 2018). The LPSI is the BLP of H, not the BLUP of H (Bulmer, 1980;Cochran, 1951;Henderson, 1963). However, as we have indicated in Equation 1, the BLP (LPSI) and BLUP theory are related, but, as we have indicated earlier, whereas the BLUP predicts g′ = [g′ 1 g′ 2 . . . g′ t ], the LPSI predicts linear combination of the elements of g (see Equations 2-4).

Matrix P as a special case of matrix V
We have indicated that in the LSI context the assumption that only additive effects are transmitted from generation to generation is acceptable. For t traits, matrix V is equal to whereas matrices C and R are of size t × t. Here, matrix V is associated with the prediction of vector g′ = [g 1 ′ g 2 ′ . . . g t ′] (Equation 1). Now suppose that matrix Z* = I nt (where I nt is an identity matrix of size nt × nt) and that the level of inbreeding among individuals is null; then A = I n , and the matrix V can be written as

= +
As we shall see later, in this last case matrix V is associated with the BLP (LPSI) theory. Hereafter we denote it as P, which is the usual notation in the LPSI context.
By the foregoing results it is evident that, in the prediction of the net genetic merit (Equation 2), the LSI theory does not use matrix A. However, matrix A is used for the estimation of C (Cerón-Rojas & Crossa, 2018). This seems to be a disaventage of the LPSI theory with respect to the BLUP theory. As we have indicated earlier, BLUP predicts vector g′ = [g 1 ′ g 2 ′ . . . g t ′], not the net genetic merit; this means that BLUP is associated only with a part of the LPSI theory: that associated to the expected genetic gain per trait (see Equations 8 and 10), which also predict the mean values of g′ = [g 1 ′ g 2 ′ . . . g t ′]. Jeyaruban et al. (1995, Abstract, p. 1) compared the efficiency of BLUP and the LPSI and concluded that: "The relative selection response with the selection indices compared to the BLUP estimates were from 94.5 to 99.4% of BLUP. Inbreeding was higher in the BLUP selected populations, which could offset any advantage of BLUP."

The LPSI theory
In this section, we describe the LPSI theory and all the LSIs associated to this theory.

Basic conditions to construct a valid LPSI in the single-stage selection context
Conditions to construct a valid LPSI are as follows. (a) The phenotypic value is additively composed of two main parts: the genotypic value and the environmental contribution. (b) The genotypic value is the individual breeding value; thus, it is composed only of additive genes effects. (c) The net genetic merit of an individual is the genotypic economic value (Kempthorne & Nordskog, 1959). Crop Science opportunity to have offspring (Hazel & Lush, 1942). (h) The LPSI values in the c th selection cycle and the LPSI values in the (c + 1) th selection cycle are not correlated. (i) The correlation between the LPSI and the net genetic merit should be at its maximum in each selection cycle. These conditions are valid for all unconstrained or constrained LPSIs in the singlestage selection context. Cerón-Rojas and Crossa (2018) have given conditions for the unconstrained and constrained LGSI.

4.3
The net genetic merit be a vector of true unobservable genotypic random variables for t traits with multivariate normal distribution and null expectation. Then the q th (q = 1, 2, . . . , n; n = number of individuals) individual net genetic merit is Smith (1936) called Equation 2 "the genetic value of a plant or line." is a vector of random variables whereas in the BLUP equation described earlier, g′ = [g′ 1 g′ 2 . . . g′ t ] is a vector of particular realization of breeding values. Note that using the BLUP of g′ = [g′ 1 g′ 2 . . . g′ t ] (i.e.,̂′ = [̂′ 1̂′ 2 …̂′ ]), a predictor of Equation 2 is: ] is a vector of estimated net genetic merit values for n indviduals, and w j is the j th element of the vector of economic weights w (Dekkers, 1997;Portes et al., 2020). The predictor̂combines the BLUP and LPSI theory to predict the net genetic merit.

The net genetic merit and its relationship with y
For the q th individual, the relationship between Equation 2 and the vector of random phenotypic variables y′ where m H is the mean of H q , μ′ = [μ 1 μ 2 . . . μ t ] is a vector 1 × t of phenotypic means of y q , and e q is the deviation representing the extent to which β′(y q − μ) differs from the expected value of H q (Young, 1964). In Equation 3, is a vector of random variables, whereas in the BLUP equation described earlier, y′ = [y′ 1 y′ 2 . . . y′ t ] is a vector of particular realization of phenotypes values for t traits. In addition, the j th (j = 1, 2, . . . , t) element of vector μ is equivalent to Xθ (Equation 1) when X = 1 (a vector of 1s) and θ = μ j . We have assumed that e q has a normal distribution, null expectation, and variance σ e 2 and that the covariance between any pairs of e q and e r (q ≠ r) is zero. As we will see later, β′(y q − μ) = I q is the LPSI. It is possible to show that the random error (e q ) and the LPSI are uncorrelated (Bickel & Doksum, 2015). The variance of H q can be written as σ H 2 = w′Cw = σ I 2 + σ e 2 , where σ I 2 = β′Pβ is the variance of I q , and P = C + R is the phenotypic covariance matrix of y′ q = [Y q1 Y q2 . . . Y qt ]; C is as defined earlier, and R is the residual covariance matrix. All these last three matrices are of size t × t.
Equation 3 is a multiple linear regression model where the independent (y q ) and dependent (H q ) variables are random and have a joint multivariable normal distribution (Rencher & Schaalje, 2008). Nevertheless, whereas in the standard regression model the values of H q and y q are observables, in Equation 3 only the y q values are observable. For this reason, H q should be predicted using, for example, the BLP.

4.5
The BLP of H Hereafter we will denote the vector of phenotypic trait random variables as Y and its particular realization values as y; and, to simplify notation, we will denote H q and I q as H and I, respectively. Suppose that H and Y have a joint multivariate normal distribution and that we are looking for a linear predictor of H assuming that Y is the only information that we can use to predict the unobsevable random variable H. We want to find a predictor function of Y such that this predictor is "close" to H in the sense of the MSPE, which is a measure used in the mathematical theory of prediction (Bickel, 2015).
where β 0 = m H − β′μ, w′C and μ are as defined above, P −1 is the inverse of the matrix P, and β = P −1 Cw. Cochran (1951) was the first to indicate that the LPSI is the BLP of H. Equation 4 is unique and unbiased (Christensen, 2011;Searle et al., 2006) and represents a model that is linear in y as well as in β. In addition, it indicates that the problem of predicting H is simply estimating its conditional mean, whereas the LPSI variance is σ I 2 = β′Pβ. The assumption of multivariate normal distribution of H and Y is a sufficient condition for

Crop Science
Equation 4 to be linear; moreover, any function that is expressible as Equation 4 is regarded as linear even if the vector β depends on the joint distribution of H and Y (Bickel & Doksum, 2015;Harville, 2018). Smith (1936) called Equation 4 "discriminant function." By Equations 1-4, the random vectors g and Y have joint multivariate normal distribution with mean E(g) = 0 and E(Y) = μ, and covariance matrix var where P and C are as defined above. Moreover, the distribution of LPSI and H = w′g values is a bivariate normal distribution, with mean

E(H) = m H and E(I) = m I and covariance matrix var
, where β, w′, σ H 2 , and σ I 2 were defined earlier and σ HI = w′Cβ is the covariance between H and LPSI. As we will see later, because we can assume that β 0 = m H − β′μ is a fixed contant, the maximized LPSI expected genetic gain per trait (Equation 10), the maximized correlation between H and LPSI (Equation 11), and the maximized LPSI selection response (Equation 12), are not affected if in Equation 4 we assume that E(H) = 0 and E(Y) = 0. Likewise, when we select parents for the next generation, they will be the same in both cases, that is, when E(H) = 0 and E(Y) = 0 and when E(H) = m H and E(Y) = μ. Thus, we think that there are not loss of generality or loss of accuracy to describe the LSI theory if we assume that E(H) = 0 and E(Y) = 0. In practice, E(Y) = 0 only if y is centered with respect to μ.

4.6
The LPSI prediction error Because β = P −1 Cw, then σ HI = w′Cβ and σ I 2 = β′Pβ are the same, from where the variance of the LPSI prediction error is and σ HI 2 = (w′Cβ) 2 are, respectively, the square of correlation and the square of covariance between H and LPSI. We defined the other parameters earlier.
The conditional mean of H is dependent on y (Equation 4); however, the conditional variance of H (Equation 5) is independent of y. Thus, under the multivariate normal assumption, Equations 4 and 5 provide a linear model with constant variance (Rencher & Schaalje, 2008). Moreover, if the marginal density of LPSI is normal and if the conditional density of H given LPSI is normal for every LPSI value, the assumption of bivariate normality of LPSI and H is guaranteed (Arnold et al., 1999;Kagan & Wesolowski, 1996).

The intensity of selection in the single-stage context
Let I be transformed into a random variable T, as T = (I − m I )/σ I , where E(I) = m I and σ 1 is the standard deviation of I. Assume that all I values higher than I * value corresponding to any proportion of selected individuals (p) will be selected; then the variables T* = (I* − m I )/σ I and T are related to selection intensity (k) as is the height of the ordinate of the normal curve at the lowest value of τ* retained, and p is the proportion of the population of animal or plant lines that will be selected (Smith, 1936). In the selected population, Equation 6 is the uncoditional expectation of the random variable T [E(T)] weighted by p and was obtained by Smith (1936). It is possible to show that the variance of T (weighted by p), over the interval (τ*, ∞), is equal to 1 + k(τ* − k). Several authors (Hazel & Lush, 1942;Kempthorne & Nordskog, 1959;Xu & Muir, 1992;Young, 1964) have given additional details of Equation 6.

The conditional expectation of H and g given I
Suppose that the expectations of H and g are null, whereas the expectation of I is E(I) = m I ; then the conditional expectation of H and g given I is, respectively, where cov(H, I) = β′Cw is the covariance between H and I, cov(g, I) = Cβ is the covariance between g and I, T = (I − m I )/σ I , and σ I 2 = β′Pβ and σ 1 are the variance and standard deviations of I, respectively. In Equation 7, cov(H, I)/σ I 2 is the relationship between E(H|I) and I (or the regression of H on I), and cov(g, I)/σ I 2 is the relationship between E(g|I) and I in Equation 8.

4.9
The LPSI selection response, expected genetic gain per trait, and correlation By Equations 6-8, the LPSI selection response (R) and expected genetic gain per trait (E) are, respectively, where k was defined in Equation 6, σ = √ ′ is the standard deviation of H, and ρ HI = σ HI /σ H σ I is the correlation between H and LPSI. All the other parameters of Equations 9 and 10 were defined earlier. Smith (1936) obtained Equation 9, and Hanson and Johnson (1957) were the first to describe Equation 10. In Equation 9, R is a scalar; in Equation 10, E is a vector t × 1 (t = number of traits) of the expected genetic gain value of each trait. In addition, R is the expectation of H, and E is the expectation of g.
To maximize the selection response (Equation 9), Smith (1936) used differential calculus to derive Equation 9 with respect to β. The result reported by this author was the vector of regression coefficients β = P −1 Cw. He indicated that β = P −1 Cw should maximize the selection response. Thus, Smith (1936) did not use the BLP theory to develop the LPSI theory because using differential calculus is not necessary to obtain β = P −1 Cw in the BLP context (Equation 4).

4.10
The maximized correlation between H and I Because β = P −1 Cw, the maximized correlation between H and I is where all the terms were defined earlier. Equation 11 represents the proportion of the variance of H that can be attributed to the regression relationship with I. The covariance (σ HI ) and correlation are good measures of relationship only for variables with linear trends and are generally unsuitable for non-normal random variables with a curvilinear relationship (Rencher, 2002). That is, Equation 11 is a measure of linear dependency between H and I and is called the "multiple correlation coefficient," whereas ρ 2 max = σ 2 σ 2 is called "the coefficient of determination or squared multiple correlation" (Rencher & Schaalje, 2008). Note that while ρ HI = σ HI /σ H σ I can take any value in the interval −1.0 and 1.0, Equation 11 gives the maximum value of ρ HI , and when ρ HI = 0, H and I are independent.

4.11
The maximized selection response Once again, because β = P −1 Cw, the maximized LPSI selection response is Equation 12 predicts the mean improvement in H attributable to indirect selection on I only when β = P −1 Cw and is proportional to √ ' and k. Thus, whereas in Equation 9 the selection response can take any value, Equation 12 gives the maximum value of Equation 9. This is the main difference between the two equations.
Cerón-Rojas et al. (2006) showed that Equation 12 is related to the Cauchy-Schwarz inequality, which corroborates that β = P −1 Cw is a global minimum when the mean squared difference between I and H, E[(H − I) 2 ], is minimized, and a global maximum when the ρ HI correlation between I and H is maximized because Equation 12 is true only when β = P −1 Cw.

4.12
The maximized correlation, selection response, and expected genetic gains have the same form for all LSIs Equations 9-12 have the same form for all LSIs described in this review. Thus, as we will see later, the only difference between those equations and the selection response, expected genetic gain per trait, and correlation, associated with the other LSIs, is the type of information that the LSI vector of coefficients and the matrices contain.

Summary of the LPSI statistical properties
When H and I have a joint bivariate normal distribution; β = P −1 Cw; and P, C, and w are known, the statistical LPSI properties are the following: 1. Equations 10-12 are maximum. 2. The variance of I and the covariance between H and I are the same. 3. The maximized correlation between H and I is equal to Equation 11. 4. The LPSI is unbiased in the sense that E(I) = E(H) (Searle et al., 2006).  (Equation 9). That is, with truncation selection, Equation 4 maximizes the expected mean of the selected individuals (Cochran, 1951;Henderson, 1990). 8. The I values are appropriate to make selection, i.e., selecting the plants or animals with the highest predictions to be parents of the next generation (Searle et al., 2006). 9. The LPSI is the BLP of H in the sense that it (a) is a linear function of y; (b) is "best" in the sense that it has minimum mean squared error of prediction; and (c) is optimal and unique among the class of linear predictors based on y (Robinson, 1991;Searle et al., 2006). (2018) showed that all the LSIs associated with the LPSI have similar properties as those described in Points 1-9, as exemplified by CLPSI and CLGSI. This occur because the CLPSI and CLGSI vector of coefficients are linear combinations of the LPSI and LGSI vector of coefficients or projections of these vectors to a different space.

Multi-stage LPSIs
One stage is the animal or plant age at which breeders can measure and select the trait of economic interest. Cochran (1951) and Young (1964) combined the LPSI theory with the independent culling method and developed the optimum multi-stage LPSI (OMLPSI), which can include several stages in each selection cycle. The OMLPSI takes into consideration the correlation among indices at different stages when making selections and requires multiple integration techniques to derive selection intensities. In addition, after the first selection stage, the OMLPSI values could be non-normally distributed, and there are problems of convergence when the traits and the index values are highly correlated (Börner & Reinsch, 2012). Xu and Muir (1992) developed the decorrelated multi-stage LPSI (DMLPSI) as one possible solution to the OMLPSI problems. After the first selection stage, the DMLPSI values could be normally distributed, and this index does not require multiple integration techniques to derive selection intensities because it maximizes the correlation between the index and the net genetic merit at each stage under the restriction that the covariance between the DMLPSI values at different stages be zero. To obtain truncation points and selection intensities, Xu and Muir (1992) derived a set of nonlinear equations that can be solved iteratively using a multidimensional New-ton method. One problem associated with DMLPSI is that the selection responses and correlation with H are lower than the OMLPSI selection response and correlation after the first selection stage (Cerón-Rojas et al., 2019a, 2019b.

4.15
The OMLPSI As we have indicated earlier, there is no loss of generality if we assume that E(H) = 0 and E(y) = 0 to describe the LSI theory. Thus, to simplify notation, hereafter we will assume that E(H) = 0 and E(y) = 0. These last two assumptions are common in the multi-stage LPSI context (Xie & Xu, 1997, 1998Xu & Muir, 1992;Young, 1964). In practice, E(y) = 0 only if y is centered with respect to μ = E(y).
be a vector of random variables for t traits. We can partition this vector into S (t i < S < t, where t i is the number of traits at Stage i) subvectors as y′ = [ω′ 1 ω′ 2 . . .
] is a subvector of y at Stage i. Thus, at Stage S, = ∑ =1 is the total number of selected traits from vector y and, for each stage, the BLP (or OMLPSI) of H is where ω′* 2 = [ω′ 1 ω′ 2 ]. In Equation 13, the BLP changes at each stage; however, the form of this predictor is the same as that of the LPSI. At Stage i, let I i = β′ i ω i be the index, where ′ = [ β 1 β 2 … β ] is the OMLPSI vector of coefficients, and ω i and t i are as defined earlier; then we can construct the OMLPSIs as follows: This last result indicates that until Stage S − 1, each index is partial, but at Stage S, I S = β′ 1 ω 1 + β′ 2 ω 2 + . . . + β′ S ω S is a whole index. Young (1964) called to this procedure "the part and whole index selection," whereas Xu and Muir (1992) called it "selection index updating" because as traits become available, each subsequent index contains all traits available up to that current stage. The above equation is valid for any unconstrained or constrained multi-stage index.
w t ] is as defined earlier.

4.17
Maximized selection response, expected genetic gain per trait, and correlation at Stage i By Equation 14, the maximized selection response ( max ), expected genetic per trait ( max ), and correlation (ρ max ) at Stage i, respectively, are where k i is the selection intensity. For all stages, the total selection response and expected genetic gain per trait are = ∑ =1 max and = ∑ =1 max , respectively.
at Stage i) be subvectors of y until Stages i − 1 and i, and let t i−1 and t i−1 + t i be the size of vectors ω′ (i−1) and ω′ (i) , respectively; then the covariance matrix between ω′ (i−1) and ω′ (i) (Q (i−1)i ) will be of size t i−1 × (t i−1 + t i ) and can be written as Matrix Q (i-1)i is a nonsquare and nonsymmetric phenotypic variance matrix and is useful by making the DMLPSI and constrained decorrelated multi-stage LPSI (CDMLPSI) values independent among stages. By definition, a covariance matrix is symmetric; however, in this case, matrix Q (i-1)i is of size t i−1 × (t i−1 + t i ) because until Stages i − 1 there are t i−1 traits, whereas at Stage i there are t i−1 + t i traits. This means that at Stage i − 1 we have an index with t i−1 traits, whereas at Stage i we have an index with t i-1 + t i ; that is, the trait selected until Stage i − 1 and the traits selected until Stage i. By this reasoning, as we indicated earlier, Young (1964) called this procedure "the part and whole index selection," whereas Xu and Muir (1992) called it "selection index updating" because as traits become available, each subsequent index contains all traits available up to that current stage.

4.18.2
The DMLPSI vector of coefficients at Stage i Let I i−1 = b′ i−1 ω i−1 and I i = b′ i ω i be the DMLPSIs at Stages i -1 and i, respectively, and let J′ i−1 = [I 1 I 2 . . . I i−1 ] be a vector of DMLPSIs values until Stage i − 1. We need to obtain the DMLPSI vector of coefficients at Stage i such that the covariance between I i and J i−1 will be null [i.e., cov(I i , J i-1 ) = 0]. By this restriction, I i and I i−1 are not correlated; hence the name "decorrelated multi-stage linear phenotypic selection index." Xu and Muir (1992) where −1 is the inverse of matrix Q ii , and I i is an identity matrix of the same size as Q ii . When the restriction cov(I i ,J i−1 ) = 0 is not imposed, β i = b i , and at Stage 1 (i = 1), −1 is the inverse of matrix Q 11 −1 , Λ 1 = C 1 , and cov(ω 1 ,g) = C 1 . The method to obtain Equation 16 is valid for all selection indices associated with DMLPSI.
The DMLPSI expected genetic gain per trait, selection response, and correlation at Stage i are similar to those of Equation 15 changing β i for b i = K i β i . The only difference between the DMLPSI and OMLPSI is the way the selection intensities are obtained. Xu and Muir (1992) described a general method to obtain the DMLPSI selection intensity, whereas Cerón-Rojas et al. (2019a, 2019b) described a method to obtain the OMLPSI selection intensity in the twostage context. Methods to obtain the DMLPSI and OMLPSI selection intensities are valid for both the unconstrained and constrained multi-stage indices.

OMLPSI selection intensity
As we have seen in Equation 6, the selection intensity (k), in the single-stage context, is related to the height of the ordinate of the normal curve z(τ*) and the proportion selected (p) as k = [z(τ*)]/p. Kempthorne and Nordskog (1959) (Arismendi, 2013;Wilhelm & Manjunath, 2010). A truncated distribution is a conditional distribution resulting when the domain of the parent distribution is restricted to a smaller region (Hattaway, 2010). The equation k = [z(τ*)]/p is also called the "hazard function," the "hazard rate," or the "inverse Mill's ratio" of the normal distribution (Rausand & Hϕyland, 2004). Young (1964) showed that the intensity of selection for the two-stage case could be obtained in a similar manner as k = [z(τ*)]/p (Equation 6). Thus, suppose that indices I 1 and I 2 have a joint normal distribution and can be transformed into the standardized random normal variables T 1 = (I 1 − m 1 )/σ 1 and T 2 = (I 2 − m 2 )/σ with mean zero and variance 1, where m 1 and m 2 are the means, and σ 1 and σ 2 are the standard deviations of I 1 and I 2 , respectively. In this case, the method of selection is to retain animals or plants with T 1 ≥ c 1 at Stage 1 and T 1 + T 2 ≥ c 2 at Stage 2, where c 1 and c 2 -T 1 are truncation points for I 1 and I 2 , respectively. Thus, according to Young (1964), the selected population has the bivariate left truncated normal distribution with probability density function given by ℎ 1 , 2 (τ 1 , τ 2 ) = 1 , 2 (τ 1 , τ 2 ) where 1 , 2 (τ 1 , τ 2 ) = [ τ 2 1 + τ 2 2 − 2ρ 2 12 τ 2 1 τ 2 2 ] } and ρ 12 is the correlation between T 1 and T 2 . The total proportion selected (p = q 1 q 2 ) can now be written as where c 1 and c 2 -τ 1 are truncation points for I 1 and I 2 , respectively. It is evident that when the number of stages increases, the number of variables in 1 , 2 (τ 1 , τ 2 ), and the number of integrals in p also increases.
By the foregoing results, in the multi-stage selection context it is usual to fix a proportion (p) before selection is carried out and then, based on the fixed p value, to determine the proportion q i (i = 1, 2, . . . , S) for each stage under the restriction where S is the number of stages. In a two-stage selection context, p = q 1 q 2 . It is assumed that p is known but q i is unknown. One way of solving the equation p = q 1 q 2 is by trial and error, depending on the importance of each trait (Hazel & Lush, 1942) or each index at each stage. Cerón-Rojas et al. (2019a, 2019b have given a general method to estimate the OMLPSI selection intensities in the two-stage context, whereas Xu and Muir (1992) have provided a general method to obtain the selection intensities for the DMLPSI. The methods to obtain the selection intensities described above are valid for any multi-stage index. Until now, there has been no general procedure to obtain the OMLPSI selection intensity for any number of stages.

4.19
LGSI Togashi et al. (2011) originally proposed the LGSI, which is a linear combination of GEBVs; however, Ceron-Rojas et al. (2015) and Cerón-Rojas and Crossa (2019) developed the complete unconstrained and contrained LGSI theory. These authors designed the LGSI for selecting genotypes in inbred populations; that is, LGSI exploits the linkage disequilibrium between markers and quantitative trait loci (QTL) produced when inbred lines are crossed to select parents for the next selection cycle. This index uses all available marker information in the prediction because it obtains the GEBV by multiplying the genomic best linear unbiased predictor of all estimated marker effects in the training population by the marker code values of the testing population (VanRaden, 2008).

The genomic breeding values
Let u j be an m × 1 vector of QTL additive effects associated with markers that affect the jth trait, and let L be a matrix n × m (n = number of individual and m = number of markers in the population) of coded marker values (e.g., 2 -2q, 1 -2q, and -2q for the marker genotypes AA, Aa, and aa, respectively); then = is the vector of individual genomic breeding values associated with the jth characteristic (j = 1, 2, . . . , t; t = number of traits) of the candidates for selection. We assumed that α j has multivariate normal distribution with mean 0 and covariance matrix Gσ α 2 , where σ α 2 is the additive genomic variance of α j ; G = LL′|κ is the matrix of additive genomic relationship, κ = ∑ =1 2 (1 − ) in an F 2 population; q l is the frequency of allele A; and 1 − q l is the frequency of allele a for the lth marker (l = 1, 2, . . . , m).

The net genetic merit and its relationship with the genomic breeding values
The net genetic merit is related to the vector of genomic breeding values (Equation 17) as where β′ = [β 1 β 2 . . . β t ] is the vector of regression coefficients, α′ = [α 1 α 2 . . . α t ], and e is the residual with normal distribution, null expectation, and variance σ e 2 . It is possible to show that the covariance between β′α and e equals zero.

4.19.3
The LGSI The BLP of H is the conditional expectation of H given α, that is where w′Γ = cov(H, α)′ is the covariance between H and α, Γ −1 is the inverse matrix of the genomic covariance matrix Γ = var(α) = cov(α,g), w is as defined in Equation 2, and I G = w′α is the LGSI.

Selection response and expected genetic gain per trait
By Equations 8-10, the LGSI selection response and expected genetic gain per trait are, respectively, We have previously defined all parameters of Equation 20. To simplify notation, in Equation 20 we omitted the intervals between selection cycles, which denote the time required for LGSI to complete one selection cycle (Ceron-Rojas et al., 2015). Using the Cauchy-Schwarz inequality, Cerón-Roja and Crossa (2020b) showed that in the ssymptotic context, when the number of markers and genotypes tend to be infinite, = √ ' is the the upper boundary for the unconstrained LPSI selection response. It is possible to show that, in the constext of the constrained LPSI selection response, there is a similar result (Cerón-Roja & Crossa, 2019).
In the non-asymptotic context, when the number of markers and genotypes is low, matrices C and P can be written as C = Γ + Ξ and P = (Γ + Ξ) + R, respectively, where Ξ = C − Γ. Thus, the inverse of matrix P is This last equation indicates that in the non-asymptotic context, R G and R are related in three possible ways: By Points 2 and 3, in the non-asymptotic context, R G may be equal to or larger than R depending on the size of w′Ξw and w′[(Γ + Ξ) −1 + R −1 ] −1 w.

COMBINED LINEAR SELECTION INDICES
A combined LSI to predict H (Equation 2) is a linear combination of phenotypic values and marker scores (LMSI) or phenotypic values and GEBVs (called combined LGSI). Both indices combine information on markers linked to QTL and phenotypic values of the traits in the prediction of H because it is impossible to identify all QTL affecting the economically important traits (Dekkers, 2007;Dekkers & Settar, 2004;Lande & Thompson, 1990;Li, 1998). The LMSI assumes that favorable alleles and their average effects on phenotype are known; however, this assumption is only valid for traits conditioned by major genes and not for quantitative traits that are influenced by the environment. This is important because many QTL with small effects could interact with the environment and among themselves (Heffner et al., 2009;Hospital et al., 1997). Linear molecular selection index efficiency depends on various factors, such as number and density of markers associated with QTL, population size, trait heritability, additive genetic variances explained by markers, and the precision of the estimated effect of gene substitution (Dekkers & Dentine, 1991). Several authors (Gimelfarb & Lande, 1994Zhang & Smith, 1992, 1993 have pointed out the effectiveness of the LMSI in inbred populations with large population sizes and traits with low heritability values when only one trait and its associated molecular score are considered. Moreau et al. (2000Moreau et al. ( , 2007 found that the LMSI was more effective than LPSI only in early-generation testing and that LMSI had the additional disadvantage of increased costs due to molecular marker evaluation. In the genomic selection context, Dekkers (2007) developed the "combined LGSI" as a possible solution to LMSI problems. This index uses GEBVs (no marker scores) and phenotypic values jointly to predict H and is very similar to the LMSI. We describe the LMSI, the combined LGSI, and the combined optimum and decorrelated multi-stage LGSI (OMLGSI and DMLGSI, respectively) developed by Cerón-Rojas and Crossa (2020a).

5.1
The LMSI molecular score Suppose that the QTL effects combine additively both within and between loci; then the h th unobservable genetic value G h can be written as where ϑ k is the effect of the QTL k th , υ k is the number of favorable alleles at the QTL k th (2, 1, or 0), and N is the number of QTL affecting the j th trait of interest. Because the QTL effect values are not observable, the G h values are also not observable; however, we can use a linear combination of the markers linked to the QTL (s h ) that affect the j th trait to predict G h as

Crop Science
where s h is a predictor of G h , b k is the regression coefficient of the linear regression model, l k is the coded value of the k th markers (e.g., 1, 0, and −1 for marker genotypes AA, Aa, and aa, respectively), and M is the number of selected markers linked to the QTL that affect the j th trait. The above equation is called the "marker score" (Lande & Thompson, 1990). The number of selected markers is only a subset of potential markers linked to QTL in the population under selection; thus, the the variance of the s h values should be lower than or equal to the variance of the G h values.

The combined LSI
The combined LSI is where φ can be the vector of marker scores, s′ = [s 1 s 2 . . .

Vector of coefficients, maximized selection response, and correlation
Let var(s) = S be the covariance matrix of marker score and Γ as defined in Equation 19; then, according to the LPSI theory, the vector of coefficients that maximizes the combined LSI selection response, expected genetic gain per trait, and correlation is is the combined LGSI vector of coefficients. Vector v′ = [w′ 0′] is the vector of economic weights, where 0 is a null vector of size t × 1. The other parameters were previously defined.

By Equation 23, the maximized LMSI and combined LGSI selection response of both indices is
where k is the selection intensity. When S and Γ are null matrices, β y = b = P −1 Cw (the LPSI vector of coefficients), whereas R C = R (the LPSI selection response). In addition, when the number of markers and genotypes increases, matrix Γ tends to matrix C and at the limit Γ = C, from where β y = 0, β C = w, and Equation 24 is equal to = √ ′ (Equation 20). That is, the weights and selection response of the LGSI and the combined LGSI will be equal.

The combined LGSI expected genetic gain per trait and correlation
The maximized combined LGSI expected genetic gain per trait (E C ) and correlation (ρ ) are, respectively, where k is the selection intensity. Note that the denominator of ρ (Equation 25) and the denominator of the maximized LPSI correlation (Equation 11) are very similar; however, the numerator of both equations is different. We would expect that √ ′ ≥ √ ′ , and then ρ ≥ ρ .

Asymptotic relationship between combined LGSI and LGSI parameters
This means that in the asymptotic context, the CLGSI expected genetic gain per trait is twice the LGSI expected genetic gain per trait. Of course, 2 is only a proportionality constant; thus, in reality, = . Similarly, when Crop Science Γ = C and β′ = [0′ w′], β′Oβ = w′Γw and w′Cw = w′Γw in ρ (Equation 25); thus, the maximum correlation between H and I C in the asymptotic context will be equal to 1.0, as we would expect.

Combined multi-stage LGSIs
Based on the LMSI, Xie and Xu (1998) developed a decorrelated multi-stage LMSI. They assumed that their index would increase the efficiency of molecular assisted selection in a two-stage context by first (Stage 1) selecting immature seedlings (embryos) based on molecular score information only, and later, at Stage 2, by selecting individual traits based on molecular score and phenotypic value information jointly. The Xie and Xu (1998) index has the same LMSI problems as previously indicated. For this reason, Cerón-Rojas and Crossa (2020a) adapted the Dekkers (2007) index to the multistage context using the OMLPSI and DMLPSI approaches. Both these indices use GEBVs and phenotypic information jointly to predict the net genetic merit; then they are free of the problems of the Xie and Xu (1998) index. In the multistage context, the Cerón-Rojas and Crossa (2020a) indices are as follows: at Stage 1, we can select immature (seedlings, embryos) based on GEBVs information only and, in Stage 2, we select individual traits based on GEBVs and phenotypic values jointly. This is the Xie and Xu (1998) idea but in the genomic selection context. The approach based on the OMLPSI optimum index will be called "optimum combined multi-stage LGSI" (OCMLGSI), whereas the index based on the DMLPSI will be called "decorrelated combined multi-stage LGSI" (DCML-GSI) because, at stage two, both indices use GEBVs and phenotypic information jointly to predict the net genetic merit.

5.8
The OCMLGSI and DCMLGSI parameters at Stage 1 The LGSI and LPSI are part of the OCMLGSI and DCML-GSI. At stage-one the estimator of the LGSI (̂= w 1̂1 + w 2̂2 + ⋅ + ŵ, wherê=̂is the vector of GEBV of trait j th ) is used to predict H because at this stage, both indices select immature seedlings (or embryos) based only on GEBV information. The OCMLGSI and DCMLGSI maximized selection responses are, respectively, where 1 and 1 (the selection intensities for each index) are the only difference between 1 and 1 . The maximized correlation between H = w′g and I G = w′α is ρ where √ ′ is the standard deviation of the variance of I G = w′α, σ = √ ′ is the standard deviation of H = w′g, and C is the covariance matrix of g.

Covariance matrices for OCMLGSI and DCMLGSI
The main difference between the multi-stage LPSIs described earlier (OMLPSI and DMLPSI) and OCMLGSI and DCMLGSI is that, in addition to y′ = [Y 1 Y 2 . . . Let g′ = [G 1 G 2 . . . G t ] be a vector of true unobservable random breeding values for t traits, and let α, y, and ω 1 be as defined earlier. The covariance between α j and G j (j = 1, 2, . . . , t) is equal to the variance of α j [i.e., cov(α j , G j ) = σ α 2 ] (Dekkers, 2007); the covariance matrix of α is var(α) = Γ, and the covariance matrices between ω i (i = 1, 2, . . . , S) and g for S stages is where at Stage 1, ω 1 = α, cov(ω 1 , g) = Γ is the genomic covariance matrix, and from Stage 2 to S, cov(ω i , g) = C i is the i th genotypic covariance submatrix of Ω Γ . Note that while the size of matrix Γ is t × t, the size of matrix C i will be n i × t, where n i is the size of vector ω i . For example, if g′ = [g 1 g 2 g 3 g 4 ], ω′ 1 = [α 1 α 2 α 3 α 4 ], and ω′ 1 = [Y 1 Y 2 ], then Γ will be of size 4×4, and C 2 will be of size 2×4, where 2 is the number of rows and 4 is the number of columns in matrix C 2 . By the foregoing results, the covariance matrix of η′ where all the gamma matrices are genomics matrices, while the others are phenotypic covariance matrices. Matrices Q ii

Crop Science
and Λ i are constructed as which are submatrices of ∑ and Ω Γ , respectively, until Stage i.
The only difference between OCMLGSI and DCMLGSI is the way vector β′ 2 = [β′ G β′ y ] is obtained.

5.11
The OCMLGSI parameters at Stage 2 The OCMLGSI vector of coefficients (β 2 ) that maximizes the response, correlation, and expected genetic gain per trait where 2 is the OCMLGSI intensity at stage two. The total selection response for Stages 1 and 2 is = 1 + 2 .

5.12
The DCMLGSI parameters at Stage 2 The process to obtain the DCMLGSI vector of coefficients is similar to that described in subsection "The DMLPSI vector of coefficients at Stage i," where we described how to obtain the DMLPSI vector of coefficients (Equation 16). Subsequently, in this subsection we will only provide the main results of the DCMLGSI for Stage 2. Additional details associated with DCMLGSI are in Cerón-Rojas and Crossa (2020a). The DCMLGSI vector of coefficients at Stage 2 is where Equation 29); and I 2 is an identity matrix of the same size as matrix Q 22 , S 21 = Q 21 B 1 ; and S′ 21 = S 12 = B′ 1 Q 12 is the matrix of constraints similar to the matrix defined in the DMLPSI context. Matrix K 2 transforms the OCMLGSI vector of coefficients (β 2 ) into Equation 31 and is the only difference with respect to Equation 29. The maximized DCMLGSI selection response and correlation for two stages can be written as in Equation 30 by changing β 2 = Q 22 −1 Λ 2 ν* by Equation 31 and the OCMLGSI selection intensity ( 2 ) by the DCMLGSI selection intensity ( 2 ). Finally, note that at Stage 2 B′ 1 = [w′ 0′ S−1 ], where w = b 1 is the vector of economic weights and 0′ S−1 is a null vector of size S − 1. The size of matrix Q 12 depend on the number of traits selected at Stage 2.

CONSTRAINED LINEAR SELECTION INDICES
By imposing null restrictions on Equations 10, Kempthorne and Nordskog (1959) developed the null restricted LPSI (denoted RLPSI) in the single-stage context, which imposes restrictions equal to zero on the trait expected genetic gain. Later, Mallard (1972) generalized the RLPSI and developed an optimum constrained LPSI (CLPSI), which imposes restrictions different to zero and includes the RLPSI as a particular case. Both indices mainly affect the expected genetic gain per trait (Equation 10). Whereas in the single-stage context the CLPSI vector of coefficients is a linear combination of the LPSI vector of coefficients (β = P −1 Cw), in the multi-stage context, the constrained optimum multi-stage LPSI (COMLPSI) vector of coefficients is a linear combination of the OMLPSI vector of coefficients (Equation 14). In turn, the CDMLPSI vector of coefficients is also a linear combination of the OMLPSI vector of coefficients.

CLPSI main objective and Mallard matrix
Assume that μ j is the population mean of the j th trait before selection. The main objective of the CLPSI is to change μ j to μ j + d j , where d j is a predetermined change in μ j imposed by the breeder on Equation 10. Let 1, 2, . . . , r) is the j th element of vector d′ = [d 1 d 2 . . . d r ], and let U′ be a matrix (t − r) × t (t = number of traits; r = number of restricted traits) of 1s and 0s, where 1 indicates that the traits are restricted, and 0 indicates that the traits are unrestricted. The size of matrix U′ depends on the number of restricted traits.

Matrix U′
We can construct matrix U′ as follows. Suppose that we restrict one of the t traits; then, we can restrict the first of them as U′

The CLPSI vector of coefficients
Let M′ = N′Ψ′ be a Mallard (1972) where K R = [I − D], D = P −1 Ψ(Ψ′P −1 Ψ) −1 Ψ′, P −1 is the inverse of matrix P, and I is an identity matrix t × t. If N = U and U is a null matrix, then b R = β (the LPSI vector of coefficients). This means that the CLPSI is the most general index and it includes the RLPSI and LPSI as particular cases.
Using restriction U′C′β = θ*d, Tallis (1985) showed that Equation 32 can be written as ). In addition, when θ* = 1.0, Equation 34 is equal to b T = b R + δ*, which is not an optimum CLPSI (Mallard, 1972;Tallis, 1962). Itoh and Yamada (1987)  There are some problems associated with θ*. For example, the θ* values should be higher than zero and θ* ≠ 1.0 because when the value of θ* is negative, the CLPSI moves the population trait means in the opposite direction to the desired direction (Itoh & Yamada, 1987). Problems associated with the RLPSI and CLPSI are (a) due to the constraints, the correlation between RLPSI and CLPSI with the net genetic merit decreases as the number of constraints increases and (b) the CLPSI may change the means in the opposite direction to the desired direction due to the opposite sign values between w and d.
According to Itoh and Yamada (1987), one possible solution to the above problems could be to use the desired gains LPSI theory (Brascamp, 1984;Itoh & Yamada, 1986) to make a selection. The most important aspect of the desired LPSI gains is that it does not require economic weights, which means that the covariance between I and H [Cov(H,I) = w′Cβ] is not defined. In addition, this index maximizes neither the correlation between I and H (ρ IH ) nor the selection response (Equation 9). Another problem with this index is that it is not a predictor of H = w′g. Thus, we think that the CLPSI is a better option to select than the desired gains LPSI.

6.2
The maximized CLPSI expected genetic gain per trait, correlation, and response By changing vector β = P −1 Cw in Equations 32, 33, or 34 in Equations 10-12, it is possible to obtain the maximized CLPSI expected genetic gain per trait, correlation, and selection response. That is, the LPSI and CLPSI parameters are equivalent, and the only difference is how the vector of coefficients is obtained.

COMLPSI
According to the single-stage CLPSI (Equation 32) and OMLPSI theory described above, to obtain the COMLPSI vector of coefficients at Stage i we need to minimize the mean squared difference between the net genetic merit H = w′g and the index  14. Thus, the COMLPSI is more general than the OMLPSI and includes the OMLPSI and the multi-stage null phenotypic restricted index as particular cases. Xie and Xu (1997) extended the DMLPSI to a constrained DMLPSI (CDMLPSI); however, their approach is based on the single-stage Tallis (1962) constrained index, which is not optimal. Based on the Mallard (1972) index (Equation 32), which is an optimal index, Cerón-Rojas et al. (2019b) developed an optimal CDMLPSI, which we will describe in this subsection. The main difference between the CDMLPSI and the COMLPSI is that the COMLPSI imposes only one restriction when it solves the LPSI equations to obtain its vector of coefficients, whereas the CDMLPSI imposes the additional restriction that the covariance among the CDMLPSI values between stages be zero. As we shall see, the CDMLPSI vector of coefficients is a linear combination of the OMLPSI vector of coefficients (Equation 14). Let I i−1 = δ′ i−1 ω i−1 and I i = δ′ i ω i be the CDMLPSI at Stages i − 1 and i, respectively, and let J′ i−1 = [I 1 I 2 . . . I i−1 ] be a vector of CDMLPSIs values until Stage i − 1 so that the covariance between I i = δ′ i ω i and J′ i−1 is null. To obtain δ i , we need to minimize the mean squared difference between H = w′g and transform the OMLPSI vector of coefficients (β i ) into the COMLPSI and the CDMLPSI vectors of coefficients, respectively. In addition, the CDMLPSI expected genetic gain per trait, selection response, and the correlation at Stage i are the same as those of the OMLPSI when replacing β i with = .

The single-stage constrained LGSI
In a manner similar to the single-stage CLPSI context (Equation 32), it can be shown (Cerón-Rojas & Crossa, 2019) that the difference between the unconstrained LGSI described above and the constrained LGSI (CLGSI) is the projection Matrices N and U were defined in Equation 32, whereas I t is an identity matrix t × t. Matrix K G is idempotent and projects the LGSI vector of coefficients (w = β) into a space that is smaller than the original space of w. Thus, while w is the vector of coefficients of the unconstrained LGSI, = is the CLGSI vector of coefficients. In the CLGSI context, the selection response and expected genetic gain per trait are the same as those described in Equation 20 when changing w for b G = K G w.

ESIM, RESIM, and PPG-ESIM
In the current review, we describe ESIM, RESIM, and PPG-ESIM, which were developed in the canonical correlation context, and do not use economic weights when predicting H = w′g and making selections. The other indices associated with ESIM (e.g., MESIM and GESIM) can be seen in Cerón-Rojas and Crossa (2018).

ESIM
The fundamental difference between ESIM and the LPSI is the interpretation of the vector w. Whereas w is a vector of economic weights (known and fixed) in the LPSI context, in ESIM w is fixed but unknown and can be estimated in each selection cycle as a linear combination of the ESIM vector of coefficients. Moreover, in ESIM, the breeding values (g) and the phenotypic records (y) are two sets of variables that should have maximum correlation (λ j ), from where the CCA theory is the base of ESIM. The ESIM index can be written as is the unknown index vector of coefficients, and t is the number of traits. In the CCA context, the vector of ESIM coefficients (b′ E ) is called the "canonical vector," whereas λ j is called the "canonical correlation" (Wilms & Croux, 2016). For the jth correlation (λ j ), the CCA theory allows us to write ESIM and H = w′g as = ′ and = ′ , respectively, where = −1 , and and λ j are obtained from Note that is the jth vector (j = 1, 2, . . . , t) of the multi-trait heritability matrix P −1 C and maximizes the correlation ρ HI = σ HI /σ H σ I described in Equation 9 (Cerón-Rojas & Crossa, 2018). Equation 38 is the ESIM fundamental results obtained by  and is the base for constructing RESIM, PPG-ESIM, MESIM, and GESIM. To select with ESIM, breeders should use the first eigenvector ( 1 ) of matrix P −1 C in = ′ 1 and the first λ 1 in the ESIM selection response.

Changing the signs and proportions of the ESIM vector of coefficients
Equation 38 can be written as ( −1 ) = λ 2 , where I = F −1 F is an identity matrix of size t × t (t = number of traits), and F = diag{f 1 f 1 . . . f t } is a diagonal matrix with values equal to any real number, except zero values. Thus, another way of writing Equation 38 is Matrices P −1 C and F(P −1 C)F −1 are similar, and both have the same eigenvalues but different eigenvectors (Harville, 2008). When the F values are only 1s, vector is not affected; when the F values are only −1s, vector will change its direction, and if the F values are different from 1 and −1, matrix F will change the proportional values of . In practice, is first obtained from Equation 38 and then multiplied by matrix F to obtain = ; that is, is a linear transformation of . Matrix F(P −1 C)F −1 is called "similarity transformation," and matrix F is called the "transforming matrix" (Cerón-Rojas & Crossa, 2018). Vector = can substitute , and in this case, the optimized ESIM index should be written as = ′ .

The maximized ESIM selection response, expected genetic gain per trait, and correlation
Changing β = P −1 Cw (Equation 4) for 1 (Equation 38) in Equation 12, the ESIM selection response can be written similarly as the LPSI response. The same is true for the maximized expected genetic gain per trait, while the ESIM correlation is λ 1 , the square root of λ 1 2 . Cerón-Rojas and Crossa (2018) showed the ESIM correlation could also be written as Equation 11.

Crop Science
There are some problems associated with ESIM. For example, when matrix P is not positive definite and matrix C is not positive semidefinite, the eigenvalues of Equation 38 could be negative, and, in such cases, ESIM should not be used to make selections. Nevertheless, Okamoto (1973) showed that when the number of observations was higher than the number of traits, P was symmetric and positive definite, and its eigenvalues were different with probability one. In turn, Gentle (2007) showed that when P and C had the same size, all eigenvalues of P −1 C existed and were finite. Thus, under the Okamoto (1973) and Gentle (2007) results, ESIM is a good option for making phenotypic selection in breeding programs when it is not possible to obtain vector w for the LPSI.

7.4
Summary of the ESIM statistical properties 1. The variance of I E and the covariance between H E and I E are the same. 2. The ESIM variance prediction error is (1 − λ 2 1 )σ 2 , where σ 2 is the variance of H E . 3. The relative effectiveness of I E in predicting H E is the ratio of (1 − λ 2 1 )σ 2 over σ 2 (i.e., 1 − λ 1 2 ); thus, the greater λ 1 2 is, the more effective I E is in predicting H E . 4. The mean squared effect of I E on H E or the total variance of H E explained by I E is σ 2 = λ 2 1 σ 2 , and the relative mean squared effect can be measured by λ 2 1 .
Cerón-Rojas and Crossa (2018) showed that all the above results are valid for RESIM, PPG-ESIM, MESIM, and GESIM. Note that the above ESIM properties are very similar to the LPSI described earlier. This is so because the only diference among the LPSI an ESIM is the method to obtain the vector of coefficients.

The constrained eigen selection index method
Similar to the CLPSI (Equation 32), the objective of the predetermined proportional gain ESIM (PPG-ESIM) is to fix r of t (r < t) traits by predicting only the genetic gains of (t − r) of them. In Equation 32, we gave details associated with the CLPSI that will be used in the subsequent subsections. The statistical bases and objectives of the PPG-ESIM are the same as those of the CLPSI described in Equation 32. That is, PPG-ESIM tries to change μ j to μ j + d j , where d j is a predetermined change in μ j . In the RESIM, d j = 0 (j = 1, 2, . . . , r), where r is the number of predetermined proportional gains.

The PPG-ESIM vector of coefficients
According to Cerón-Rojas et al. (2016a), the PPG-ESIM vector of coefficients that maximizes the PPG-ESIM selection response and expected genetic gain per trait is where matrices K E = [I t − Q E ] and Q E = P −1 M(M′P −1 M)M′ are the same as those obtained in Equation 32, I t is an identity matrix t × t, and λ P 2 and b p are the eigenvalue and the eigenvector of matrix K E P − C, respectively. When N = U, b p = b R (the vector of coefficients of RESIM), and when U′ is a null matrix, b p = b E (the vector of coefficients of ESIM). That is, PPG-ESIM is more general than RESIM and ESIM and includes the latter two indices as particular cases.

7.7
Relationships among ESIM, PPG-ESIM, LPSI, and CLPSI Cerón-Rojas et al. (2008a, 2016b described the relationships between the ESIM and PPG-ESIM vector of coefficients and the LPSI and CLPSI vector of coefficients as follows. When the LPSI and CLPSI vector of coefficients (β and b C = Kβ, respectively) are in the space generated by the eigenvectors of Equations 38 and 39, respectively, β and b C can be written as where α j * and a j * are constants that can be estimated by least squares, and and are the eigenvectors of Equations 38 and 39. Based on the Bessel inequality, Cerón-Rojas et al.
The REML method is based on projecting the data into a subspace free of fixed effects and maximizing the likelihood function in this subspace (Blasco, 2001). Thus, because(̂− 1 ) and̂are REML estimators, by the invariance property of maximum likelihood estimators (Rencher & Schaalje, 2008),̂,̂,ρ , and̂are maximum likelihood estimators and they are minimum variance and asymptotic unbiased estimators for β, R, ρ HI , and E (Cerón-Rojas & Crossa, 2020b; Muirhead, 2005). Cerón-Rojas and Crossa (2020b) showed that, in effect,̂andρ are asymptotic unbiased estimators and these authors described methods to obtain confidence intervals for the expetations of̂andρ . The REML estimators can also be used in the CLPSI, ESIM, combined LGSI, multi-stage LSI, etc.

Numerical example: LPSI vs ESIM
We compare the ESIM efficiency versus LPSI efficiency using a real data set from commercial egg poultry lines obtained from Akbar et al. (1984). The estimated phenotypic (̂) and genetic (̂) covariance matrices among the rate of lay (RL, number of eggs), age at sexual maturity (SM, days), = 1, the best way of comparing ESIM results versus LPSI results is when the LPSI coefficient vector is normalized (i.e., when the LPSI coefficient vector is equal tô * =̂∕̂′̂and then̂′ * ̂ * = 1); however, it can be shown that the normalization process only affects the estimated LPSI selection response because in that case,̂= 74.91 is divided bŷ′̂. For example, for this data set result,̂′̂= 15.76; then, the estimated LPSI selection response usinĝ * =̂∕̂′̂will bê= 74.91 15.74 = 4.75, while the rest of the estimated LPSI parameters will be the same. When 0 <̂′̂< 1 and 1 <̂, the values of̂will increase, but when 1 <̂′̂, the values of̂will decrease, as in this example. The product̂′̂does not affectρ because it is invariant to scale change. Also,̂′̂does not affect̂becausê ′̂a ppears in the numerator and denominator of both estimated parameters.
In ESIM, the sign and proportion of the expected genetic gain values for traits RL, SM, and egg weight should be according to the breeder's interest. For example, if the breeder's interest is that the expected genetic gain per trait for RL be positive and negative for SM, the sign and proportion of the values of the first eigenvector should be modified using a linear combination of̂1 as described earlier (i.e., 1 =̂1 ) in order to achieve expected genetic gain per trait values in RL and SM.
The information needed to obtain the estimated ESIM parameters is matrix̂− 1̂. We need to find the eigenvalues and eigenvectors of the equation tively. Because the estimated LPSI selection response waŝ = (74.91∕15.74) = 4.75, the estimated ESIM selection response was higher than the estimated LPSI response for this data set. Now suppose that the breeder's interest is to increase RL and decrease SM; then̂′ is a good result but̂′ is wrong. We can change the sign and proportion of̂′ by transforminĝ 1 aŝ1 =̂1 using a convenient matrix F such as  (Harville, 2008), the estimated maximized ESIM correlation (λ 1 = 0.6782) should not be affected by matrix F.

RIndSel
RIndSel is a powerful freely available Java software useful for estimating the LPSI, LGSI, CLPSI, ESIM, etc., parameters and selecting individual candidates as parents for the next selection cycle. It is available in the CIM-MYT (for its Spanish acronym, or International Maize and Wheat Improvement Center) RIndSel Selection indices for plant breeding repository at https://data.cimmyt.org/ dataverse/root?q=rindsel. RIndSel is compatible with Windows XP, 7, 8, and 10 and can be installed on 32-bit and 64bit computers. To input the data information, the user must create an Excel file (saved as a comma-delimited ".csv") with the data information to be analyzed, from where the software reads that information and estimates the index parameters.
In the linear mixed model context (Searle et al., 2006), RIndSel uses REML and two experimental designs (a randomized complete block design and a lattice or alpha lattice design) to estimate the phenotypic and genotypic covariance matrices. This software presents the results of the estimated index parameters in two Excel files (e.g., alloutSmith.csv and outSmith.csv, where, in this case, "Smith" denotes the LPSI used for the analysis) and one text file (outextSmith.txt). File "alloutSmith.csv" contains all the trait adjusted means for all genotypes, and the estimated index values, whereas file "outSmith.csv" contains the genotypes and the trait values selected with the index values after these have been ranked. File "outextSmith.txt" contains all information associated with the estimated index parameters such as the estimated phenotypic and genotypic covariance matrices, the estimated selection response, etc. A complete user manual with instructions on how to use RIndSel in plant breeding is available at https://data.cimmyt.org/dataset.xhtml?persistentId=hdl: 11529/10854.

7.11
Summary of the main results of this review In this manuscript, we have summarized and expanded upon the results of the following authors: 1. Smith (1936) established the LPSI theory on which all LSIs described in this work are based. 2. Cochran (1951) was the first to indicate that the LPSI is the BLP of the net genetic merit and was the first to develop the two-stage optimum LPSI. 3. Kempthorne and Nordskog (1959) were the first to develop the RLPSI. 4. Young (1964) combined the LPSI theory with the culling selection method and developed the OMLPSI in the multistage context, from where Cerón-Rojas et al. (2019b) developed the COMLPSI and Cerón-Rojas and Crossa (2020a) developed a COMLGSI. 5. Mallard (1972) developed the first optimum constrained LPSI, which allows imposing constraints diferent to zero on the expected genetic gain per trait and includes the RLPSI as a particular case. 6. Xu and Muir (1992)  This work complements the classic characterizations of the LSI made by several authors (Bulmer, 1980;Cochran, 1951;Henderson, 1963) and indicates that the LSI theory is a general mathematical-statistical framework that should allow breeders to confidently use it in plant and animal breeding programs.

CONCLUSIONS
Populations of plants or animals are sets of individuals that, in turn, are sets of phenotypic characteristics, such as color, weight, height, etc. Thus, when breeders select parents in a breeding program, they select individuals, not single characteristics, and the selection of any set of traits on the individuals changes not only the mean of the individual's unselected traits but also the correlation between selected and unselected Crop Science traits (Pearson, 1903). Then, in breeding programs, breeders should select parents using LSI because the prediction procedures with univariate models do not contemplate the genetic correlations between traits. In practice, evaluating plants or animals requires several simultaneous traits. For example, breeders studying yield and grain quality register phenotypic data that include yield components (e.g., grain weight or biomass), grain quality (e.g., flavor, shape, color, nutrient content), and resistance to biotic and abiotic stresses (Jia & Jannink, 2012). We have described the LSI statistical theory assuming that the net genetic merit and the vector of phenotypic values have multivariate normal distribution. Under this assumption, and when the phenotypic and genotypic covariance matrices are known, the LSIs are particular cases of the BLP theory. Furthermore, we found that regardless of the LSI characteristics (single-stage or multi-stage, unconstrained or constrained, etc.), the LSI parameters (selection response, expected genetic gain per trait, and correlation) are the same.
The BLP theory provides the mathematical-statistical framework to develop the LSI theory. Thus, the LSI is a particular case or an application of the BLP to plant and animal breeding selection. This result allows breeders to locate the LSI theory in a general mathematical framework and then confidently use it when making selections. Nevertheless, some problems of the LSI not discussed in this work are associated with the estimation of the LSI parameters. For example, when the phenotypic and genotypic covariance matrices are estimated, the LPSI could not be BLP; indeed, it is not linear in y (Searle et al., 2006). Nevertheless, Cerón-Rojas and Crossa (2020b) found that the LPSI is a good predictor of the net genetic merit including when REML estimates the phenotypic and genotypic covariance matrices. Because of this, breeders should use the LSI theory in plant breeding with confidence.